18 research outputs found

    Code Generation for Efficient Query Processing in Managed Runtimes

    In this paper we examine opportunities arising from the convergence of two trends in data management: in-memory database systems (IMDBs), which have received renewed attention following the availability of affordable, very large main memory systems; and language-integrated query, which transparently integrates database queries with programming languages (thus addressing the famous ‘impedance mismatch’ problem). Language-integrated query not only gives application developers a more convenient way to query external data sources like IMDBs, but also to use the same querying language to query an application’s in-memory collections. The latter offers further transparency to developers as the query language and all data is represented in the data model of the host programming language. However, compared to IMDBs, this additional freedom comes at a higher cost for query evaluation. Our vision is to improve in-memory query processing of application objects by introducing database technologies to managed runtimes. We focus on querying and we leverage query compilation to improve query processing on application objects. We explore different query compilation strategies and study how they improve the performance of query processing over application data. We take C# as the host programming language as it supports language-integrated query through the LINQ framework. Our techniques deliver significant performance improvements over the default LINQ implementation. Our work makes important first steps towards a future where data processing applications will commonly run on machines that can store their entire datasets in-memory, and will be written in a single programming language employing language-integrated query and IMDB-inspired runtimes to provide transparent and highly efficient querying.

    Generating code for holistic query evaluation

    We present the application of customized code generation to database query evaluation. The idea is to use a collection of highly efficient code templates and dynamically instantiate them to create query- and hardware-specific source code. The source code is compiled and dynamically linked to the database server for processing. Code generation diminishes the bloat of higher-level programming abstractions necessary for implementing generic, interpreted, SQL query engines. At the same time, the generated code is customized for the hardware it will run on. We term this approach holistic query evaluation. We present the design and development of a prototype system called HIQUE, the Holistic Integrated Query Engine, which incorporates our proposals. We undertake a detailed experimental study of the system’s performance. The results show that HIQUE satisfies its design objectives, while its efficiency surpasses that of both well-established and currently-emerging query processing techniques.
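
    The template-instantiation idea in this abstract can be illustrated with a minimal sketch. It is not HIQUE's implementation: the abstract describes generated source that is compiled and dynamically linked into the database server, whereas this stand-in instantiates a Python source template and compiles it in-process. The names `SCAN_TEMPLATE`, `ALLOWED_OPS`, and `generate_scan` are hypothetical and exist only for illustration.

```python
from string import Template

# Hypothetical code template for a filtered scan. The placeholders are
# filled in per query, so the generated loop contains no generic
# predicate-interpretation logic.
SCAN_TEMPLATE = Template('''
def scan(rows):
    out = []
    for row in rows:
        if row[$col] $op $const:
            out.append(row)
    return out
''')

# Only these comparison operators may be spliced into generated source.
ALLOWED_OPS = {"<", "<=", "==", "!=", ">=", ">"}


def generate_scan(col, op, const):
    """Instantiate the template for one predicate and compile the result.

    Returns the compiled function together with the generated source text.
    """
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op!r}")
    source = SCAN_TEMPLATE.substitute(
        col=int(col), op=op, const=repr(float(const))
    )
    namespace = {}
    # Compile the query-specific source and load it into a fresh namespace.
    exec(compile(source, "<generated>", "exec"), namespace)
    return namespace["scan"], source


# Usage: a scan specialized for the predicate "column 1 > 20.0".
rows = [(1, 10.0), (2, 25.5), (3, 40.0)]
scan, source = generate_scan(col=1, op=">", const=20.0)
result = scan(rows)
```

    The column index, operator, and constant are baked into the generated source, so the specialized function evaluates a fixed comparison rather than walking a predicate tree for every row. Hardware-specific tuning, which the abstract also mentions, is outside the scope of this sketch.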

    Updating Recursive XML Views of Relations

    This paper investigates the view update problem for XML views published from relational data. We consider XML views defined in terms of mappings directed by possibly recursive DTDs, compressed into DAGs and stored in relations. We provide new techniques to efficiently support XML view updates specified in terms of XPath expressions with recursion and complex filters. The interaction between XPath recursion and DAG compression of XML views makes the analysis of XML view updates rather intriguing. In addition, many issues are still open even for relational view updates, and need to be explored. In response to these, on the XML side, we revise the notion of side effects and update semantics based on the semantics of XML views, and present efficient algorithms to translate XML updates to relational view updates. On the relational side, we propose a mild condition on SPJ views, and show that under this condition the analysis of deletions on relational views becomes PTIME while the insertion analysis is NP-complete. We develop an efficient algorithm to process relational view deletions, and a heuristic algorithm to handle view insertions. Finally, we present an experimental study to verify the effectiveness of our techniques.

    Just-in-time compilation for SQL query processing

    Just-in-time compilation of SQL queries into native code has recently emerged as a viable alternative to interpretation-based query processing. We present the salient results of research in this fresh area, addressing all aspects of the query processing stack. Throughout the discussion we draw analogies to the general code generation techniques used in contemporary compiler technology. At the same time we describe the open research problems of the area.

    Data Provenance and Trust

    The Oxford Dictionary defines provenance as “the place of origin, or earliest known history of something.” The term, when transferred to its digital counterpart, has morphed into a more general meaning. It is not only used to refer to the origin of a digital artefact but also to its changes over time. By changes in this context we may not only refer to its digital snapshots but also to the processes that caused and materialised the change. As an example, consider a database record r created at point in time t0; an update u to that record at time t1 causes it to have a value r’. In terms of provenance, we do not only want to record the snapshots (t0, r) and (t1, r’) but also the transformation u that when applied to (t0, r) results in (t1, r’), that is u(t0, r) = (t1, r’).
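
    The record example in this abstract can be made concrete with a small sketch that stores both the snapshots and the transformations that produced them. The class `ProvenanceLog` and its methods are hypothetical names chosen for illustration; they are not taken from the work. As a simplifying assumption, each transformation here maps only the record value, with the timestamp tracked separately, whereas the abstract writes u as acting on the pair (t0, r).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple

Snapshot = Tuple[int, Any]  # (timestamp, record value)


@dataclass
class ProvenanceLog:
    """Keeps snapshots and the named transformations between them."""
    snapshots: List[Snapshot] = field(default_factory=list)
    transformations: List[Tuple[str, Callable[[Any], Any]]] = field(
        default_factory=list
    )

    def create(self, t: int, value: Any) -> None:
        # Record the origin snapshot (t0, r).
        self.snapshots.append((t, value))

    def update(self, t: int, name: str, u: Callable[[Any], Any]) -> None:
        # Apply u to the latest value and keep both u and the new snapshot.
        _, current = self.snapshots[-1]
        self.transformations.append((name, u))
        self.snapshots.append((t, u(current)))

    def replay(self) -> List[Any]:
        # Re-derive every value from the origin by reapplying each u in order.
        _, value = self.snapshots[0]
        values = [value]
        for _, u in self.transformations:
            value = u(value)
            values.append(value)
        return values


# Usage: record r at t0 = 0, then update u at t1 = 1 producing r'.
log = ProvenanceLog()
log.create(0, {"name": "Ada", "balance": 100})
log.update(1, "add_50", lambda r: {**r, "balance": r["balance"] + 50})
```

    Because the transformations are stored alongside the snapshots, the history can be checked by replaying it: the values returned by `replay()` should match the recorded snapshot values.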

    Maximizing the Output Rate of Multi-Join Queries over Streaming Information Sources

    Recently there has been a growing focus in the research community on join query evaluation for scenarios in which input characteristics may not be entirely known and inputs enter the system at highly variable and unpredictable rates. The proposed solutions to date rely upon some combination of streaming binary operators and “on-the-fly” query plan reorganization to deal with this unpredictability. In this paper, we consider a different approach, and propose a multi-input streaming join algorithm we call MJoin. We show through experiments with a prototype implementation that in many instances the MJoin produces outputs sooner than any tree of binary operators, and that it adapts well to changing input parameters without query plan modification. This suggests that the MJoin operator may be a useful addition to systems that evaluate queries containing joins over streaming inputs.

    Rate-Based Query Optimization for Streaming Information Sources

    Relational query optimizers have traditionally relied upon table cardinalities when estimating the cost of the query plans they consider. While this approach has been and continues to be successful, the advent of the Internet and the need to execute queries over streaming sources requires a different approach, since for streaming inputs the cardinality may not be known or may not even be knowable (as is the case for an unbounded stream). In view of this, we propose shifting from a cardinality-based approach to a rate-based approach, and give an optimization framework that aims at maximizing the output rate of query evaluation plans. This approach can be applied to cases where the cardinality-based approach cannot be used. It may also be useful for cases where cardinalities are known, because by focusing on rates we are able not only to optimize the time at which the last result tuple appears, but also to optimize for the number of answers computed at any specified time after the query evaluation commences. We present a preliminary validation of our rate-based optimization framework on a prototype XML query engine, though it is generic enough to be used in other database contexts. The results show that rate-based optimization is feasible and can indeed yield correct decisions.

    Processing Declarative Queries Through Generating Imperative Code in Managed Runtimes

    The falling price of main memory has led to the development and growth of in-memory databases. At the same time, language-integrated query has picked up significant traction and has emerged as a generic, safe method of combining programming languages with databases with considerable software engineering benefits. Our perspective on language-integrated query is that it combines the runtime of a programming language with that of a database system. This leads to the question of how to tightly integrate these two runtimes into one single framework. Our proposal is to apply code generation techniques that have recently been developed for general query processing. The idea is that instead of compiling queries to query plans, which are then interpreted, the system generates customized native code that is then compiled and executed by the query engine. This is a form of just-in-time compilation. We argue in this paper that these techniques are well-suited to integrating the runtime of a programming language with that of a database system. We present the results of early work in this fresh research area. We showcase the opportunities of this approach and highlight interesting research problems that arise.